-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Added numeric ranking Performance Tests #888
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ranking.TrainTest_Multiclass_MSLRWeb10K_Ranking_FastTree: Job-ZGJLIB(Toolchain=netcoreapp2.1, MaxIterationCount=20, WarmupCount=1)
Mean = 72.7866 s, StdErr = 0.5849 s (0.80%); N = 20, StdDev = 2.6156 s
Min = 68.6174 s, Q1 = 70.0802 s, Median = 73.3494 s, Q3 = 74.9717 s, Max = 77.5199 s
IQR = 4.8915 s, LowerFence = 62.7429 s, UpperFence = 82.3089 s
ConfidenceInterval = [70.5153 s; 75.0579 s] (CI 99.9%), Margin = 2.2713 s (3.12% of Mean)
Skewness = -0.09, Kurtosis = 1.66, MValue = 3.14
-------------------- Histogram --------------------
[68.616 s ; 70.533 s) | @@@@@@
[70.533 s ; 72.219 s) | @@
[72.219 s ; 74.846 s) | @@@@@@@
[74.846 s ; 76.379 s) | @@@@
[76.379 s ; 78.363 s) | @
---------------------------------------------------
Ranking.TrainTest_Multiclass_MSLRWeb10K_Ranking_LightGBM: Job-ZGJLIB(Toolchain=netcoreapp2.1, MaxIterationCount=20, WarmupCount=1)
Mean = 65.4654 s, StdErr = 0.9955 s (1.52%); N = 20, StdDev = 4.4522 s
Min = 58.3025 s, Q1 = 62.0859 s, Median = 65.7175 s, Q3 = 68.5847 s, Max = 75.8618 s
IQR = 6.4988 s, LowerFence = 52.3377 s, UpperFence = 78.3329 s
ConfidenceInterval = [61.5993 s; 69.3315 s] (CI 99.9%), Margin = 3.8661 s (5.91% of Mean)
Skewness = 0.1, Kurtosis = 2.64, MValue = 3.25
-------------------- Histogram --------------------
[58.224 s ; 61.721 s) | @@@@@
[61.721 s ; 64.643 s) | @
[64.643 s ; 67.513 s) | @@@@@@@@
[67.513 s ; 70.412 s) | @@@@@
[70.412 s ; 74.427 s) |
[74.427 s ; 77.297 s) | @
---------------------------------------------------
Ranking.Test_Multiclass_MSLRWeb10K_Ranking_FastTree: Job-ZGJLIB(Toolchain=netcoreapp2.1, MaxIterationCount=20, WarmupCount=1)
Mean = 4.5715 s, StdErr = 0.0079 s (0.17%); N = 14, StdDev = 0.0295 s
Min = 4.5317 s, Q1 = 4.5526 s, Median = 4.5653 s, Q3 = 4.5778 s, Max = 4.6321 s
IQR = 0.0252 s, LowerFence = 4.5147 s, UpperFence = 4.6157 s
ConfidenceInterval = [4.5382 s; 4.6047 s] (CI 99.9%), Margin = 0.0333 s (0.73% of Mean)
Skewness = 0.67, Kurtosis = 2.39, MValue = 2
-------------------- Histogram --------------------
[4.524 s ; 4.643 s) | @@@@@@@@@@@@@@
---------------------------------------------------
Toolchain=netcoreapp2.1 MaxIterationCount=20 WarmupCount=1
|
build.proj
Outdated
</ItemGroup> | ||
|
||
<ItemGroup Condition="'$(IncludeBenchmarkData)' == 'true'" > | ||
<TestFile Include="$(MSBuildThisFileDirectory)/test/data/external/WikiDetoxAnnotated160kRows.tsv" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The duplication here could be simplified using MSBuild. Something along the lines of:
<ItemGroup>
<TlcResourceFile Include="WikiDetoxAnnotated160kRows.tsv" />
<TlcResourceFile Include="MSLRWeb10KTrain3.6MRows.tsv" />
<TlcResourceFile Include="MSLRWeb10KValidate1.2MRows.tsv" />
<TlcResourceFile Include="MSLRWeb10KTest1.2MRows.tsv" />
<TlcResourceFile Update="@(TlcResourceFile)">
<Url>http://aka.ms/tlc-resources/benchmarks/%(Identity)</Url>
<DestinationFile>$(MSBuildThisFileDirectory)test/data/external/%(Identity)</DestinationFile>
</TlcResourceFile>
<TestFile Include="@(TlcResourceFile->'$(MSBuildThisFileDirectory)/test/data/external/%(Identity)')" />
</ItemGroup>
I'm not 100% sure it is better, but it reduces the number of times these URLs need to be copied.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's batching, I think it would only work within a <Target>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep dan is right. its not working in this case. any other suggestion here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It works, you just need to have the right syntax. I've updated the above with actual MSBuild code that works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay thanks :)
BenchmarkDotNet=v0.11.1, OS=Windows 10.0.17134.228 (1803/April2018Update/Redstone4)
Intel Xeon CPU E5-1650 v4 3.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET Core SDK=2.1.400
[Host] : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Job-QFXMOR : .NET Core 2.1.2 (CoreCLR 4.6.26628.05, CoreFX 4.6.26629.01), 64bit RyuJIT
Toolchain=netcoreapp2.1 MaxIterationCount=20 WarmupCount=1
|
Include="..\data\external\WikiDetoxAnnotated160kRows.tsv" | ||
Link="external\WikiDetoxAnnotated160kRows.tsv"> | ||
|
||
<TlcResourceFile Update="@(TlcResourceFile)"> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TlcResourceFile [](start = 31, length = 15)
call it something non TLC
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, but the reason I chose the name originally is because of the URL:
Can we change this URL? Or make a new aka.ms URL pointing to the same location with a different name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Let's wait to merge until the MSLR-WEB10K dataset is available in the CDN.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also add the citation to the MSLR-WEB10K dataset. |
@@ -11,6 +11,7 @@ | |||
|
|||
namespace Microsoft.ML.Benchmarks | |||
{ | |||
[WarmupCount(8)] // It helps to reduce the standard deviation of these tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The normal user is unlikely to pre-train a model 8 times before training their model. This will be representative of the steady state reached when a model is retrained many times, but not very representative of the normal user's interaction w/ ML.net.
Do we know what's causing the time difference between the first run and the later runs? The first run is most representative of what a normal user will experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adamsitnik can you give us a better view here ? Increasing the warmup iterations leads to reducing the standard deviation here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@justinormont do u want me to reduce it ?
cc @danmosemsft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you give us a better view here ?
@Anipik Unfortunately, it's not that simple. To find out why given benchmark behaves differently for different warmup counts we would have to profile it. It could be that OS gets warmed up and reading the input files becomes faster or anything like that.
The normal user is unlikely to pre-train a model 8 times before training their model.
@justinormont I agree. In that case we should set the WarmupCount to 0, IterationCount to 1 and LaunchCount to 20. Which means that BenchmarkDotNet is going to start a new process 20 times and each time execute the benchmark only once, without any warmup (the real use case) and just exit the process.
Edit: we should most probably have two configs: one for training benchmarks (no warmups) and one for prediction benchmarks (the one we have today)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will revert warmupCount to 1 for this PR to get merged, we can later follow up with 2 config files as adam suggested
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, though there's a couple minor things:
- Check resource download location:
Added numeric ranking Performance Tests #888 (comment) - For security, use https:
Added numeric ranking Performance Tests #888 (comment) - Reduce extra newlines in the citation: (very minor)
Added numeric ranking Performance Tests #888 (comment)
@justinormont cam you take a look here ? |
@justinormont can you merge this ? I don't have the write access to the repo ? |
I'm going to close & re-open this pull request to notify the CI to re-check this PR. |
The CI test are failing. Though GitHub says 'in progress', it will say failed soon. The Error:
|
Looking into it |
It's a well-known issue that the |
Related issue for the Wine dataset: #889 Hot linking to a UCI dataset Currently, the UCI web server is non-responsive, causing the dataset to not download, and the test to fail. |
@justinormont the ci is green, can we go ahead and merge this one ? |
@Anipik, the merge is waiting on a merge conflict, can you look in to it? |
@justinormont i resolved the conflict |
Missing semicolon is causing the build the fail: ``` 2018-09-19T04:41:39.2181028Z Datasets.cs(171,10): error CS1002: ; expected [/__w/3/s/test/Microsoft.ML.TestFramework/Microsoft.ML.TestFramework.csproj] 2018-09-19T04:41:39.7812509Z Microsoft.ML.StandardLearners -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.StandardLearners/netstandard2.0/Microsoft.ML.StandardLearners.dll 2018-09-19T04:41:40.7120753Z Microsoft.ML.HalLearners -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.HalLearners/netstandard2.0/Microsoft.ML.HalLearners.dll 2018-09-19T04:41:40.8804119Z Microsoft.ML.Ensemble -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.Ensemble/netstandard2.0/Microsoft.ML.Ensemble.dll 2018-09-19T04:41:40.9555420Z Microsoft.ML.LightGBM -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.LightGBM/netstandard2.0/Microsoft.ML.LightGBM.dll 2018-09-19T04:41:41.5610322Z Microsoft.ML.PipelineInference -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.PipelineInference/netstandard2.0/Microsoft.ML.PipelineInference.dll 2018-09-19T04:41:42.4887819Z Microsoft.ML.Console -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.Console/netcoreapp2.0/MML.dll 2018-09-19T04:41:45.7637388Z Microsoft.ML.FSharp.Tests -> /__w/3/s/bin/AnyCPU.Debug/Microsoft.ML.FSharp.Tests/netcoreapp2.1/Microsoft.ML.FSharp.Tests.dll 2018-09-19T04:41:45.7926386Z /__w/3/s/dir.traversal.targets(25,5): error : Build failed. See earlier errors. [/__w/3/s/build.proj] 2018-09-19T04:41:45.8133725Z 2018-09-19T04:41:45.8152732Z Build FAILED. ```
Thanks @Anipik for all the unexpected work needed in this pull request. |
Added benchmarking performance tests for Numeric ranking.
cc @justinormont @sfilipi @danmosemsft @eerhardt @shauheen